---
title: "CFA Proofs for Latent Populism Variable"
format: html
toc: true
code-fold: true
self-contained: true
css: styl.css
---

```{r}
#| echo: true
#| message: false


library(here)
library(tidyverse)
library(lavaan)

df_integrated <- readRDS(here("final_data_v2/poppa_integrated_v2.rds"))

```

These materials present the confirmatory factor analysis (CFA) models used to generate the populism latent variable. For a comprehensive discussion, see *The State of Populism: Introducing the 2023 Wave of the Populism and Political Parties Expert Survey*, published in *Party Politics*. Since the publication of the article, one previously missing party has been added to the dataset. As a result, the estimates may differ slightly from those in the article; however, the underlying CFA structure remains unchanged and valid. The full set of CFA models is estimated below: the baseline model and models with one, two, and three residual covariances. The model with three residual covariances is the version used in the final dataset. The dataset contains a variable holding the regression scores from that model (i.e. standardized) ([variable name: populism_cfa]{.variable2}) and a variable that rescales these scores to range from 0 to 10 ([variable name: populism_cfa_rescaled]{.variable2}).
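
The rescaling step itself is not shown in these materials. The following is a minimal sketch, assuming a plain min-max transformation (an assumption; the exact procedure behind `populism_cfa_rescaled` may differ). The helper `rescale_0_10()` and the input values are hypothetical and serve only as illustration.

```{r}
#| echo: true

# Hypothetical helper: linearly map a numeric vector onto the 0-10 range.
# Assumes min-max rescaling; the dataset's actual procedure may differ.
rescale_0_10 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  10 * (x - rng[1]) / (rng[2] - rng[1])
}

# Illustrative factor scores (made up, not taken from the dataset)
example_scores <- c(-1.5, -0.5, 0, 0.5, 2.5)
example_rescaled <- rescale_0_10(example_scores)
example_rescaled
```

Under this transformation the lowest score maps to 0 and the highest to 10, so the rescaled values depend on which parties are in the sample.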

<br>

## Baseline Model

```{r}
#| echo: true

# We fit the CFA for the populist items. 

# 1. This is the Baseline Model. 

# Define the model

cfa_basic_full <- '

populism_cfa_basic =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

'

# Fit the model to the data

fit_basic_full <- cfa(model = cfa_basic_full, data = df_integrated)


# Summarize the results

summary(fit_basic_full, fit.measures = TRUE, standardized = TRUE)

```

### Modification Indices for Baseline Model

```{r}
#| echo: true

modificationIndices(fit_basic_full, sort = TRUE)

```

The baseline model converges normally and has five degrees of freedom. However, `fit_basic_full` does not fit the data particularly well.
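
The degrees of freedom follow from counting the unique sample moments and subtracting the free parameters. The sketch below reproduces that arithmetic, assuming the `cfa()` defaults in lavaan (first loading fixed to 1 as the marker, no mean structure). `count_df()` is a hypothetical helper written for this illustration, not a lavaan function.

```{r}
#| echo: true

# Hypothetical helper: degrees of freedom of a one-factor CFA.
# Assumes marker-variable identification and no mean structure.
count_df <- function(n_items, n_resid_cov = 0) {
  n_moments  <- n_items * (n_items + 1) / 2  # unique variances and covariances
  n_loadings <- n_items - 1                  # first loading is fixed to 1
  # free parameters: loadings + residual variances + factor variance + residual covariances
  n_free <- n_loadings + n_items + 1 + n_resid_cov
  n_moments - n_free
}

# Five items with zero to three residual covariances
df_by_model <- sapply(0:3, function(k) count_df(n_items = 5, n_resid_cov = k))
df_by_model
```

Each freed residual covariance uses up one degree of freedom, which is why the models in the following sections have four, three, and two degrees of freedom.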

## Model with One Residual Covariance

The modification indices (MI) from the baseline model show a high value for the residual covariance [generalwill ~~ indivisible]{.variable}, suggesting that freeing this parameter would substantially improve model fit. Accordingly, in the model with one residual covariance, the residuals of [generalwill ~~ indivisible]{.variable} are allowed to covary. Theoretically, it is plausible that experts interpreted these two items in a similar manner, leading to correlated residuals. Specifying this residual covariance therefore accounts for shared variance attributable to such similarities in interpretation.
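
A modification index approximates the drop in the model chi-square that would result from freeing a single parameter, so it is commonly compared against a chi-square distribution with one degree of freedom. The sketch below computes the conventional cutoffs; the significance levels are an illustrative assumption, not thresholds stated in these materials.

```{r}
#| echo: true

# Critical values of the chi-square distribution with 1 degree of freedom.
# An MI above a cutoff suggests that freeing the parameter would
# significantly improve fit at the corresponding alpha level.
alpha_levels <- c(0.05, 0.01)
mi_cutoffs <- qchisq(1 - alpha_levels, df = 1)
round(mi_cutoffs, 2)
```

Such cutoffs are only a guide. As in the text, a freed parameter should also be theoretically defensible.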

<br>

```{r}
#| echo: true

# Define the model with residual covariances

cfa_cov_1_full <- '

  populism_cfa_cov_1 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible
  
  generalwill ~~ indivisible
  
'


# Fit the model to the data

fit_cov_1_full <- cfa(model = cfa_cov_1_full, data = df_integrated)


# Summarize the results

summary(fit_cov_1_full, fit.measures = TRUE, standardized = TRUE)

```

### Modification Indices for Model with One Residual Covariance {#modification-indices-for-model-with-one-residual-convariance}

```{r}
#| echo: true

modificationIndices(fit_cov_1_full, sort = TRUE)

```

The model converges normally and has four degrees of freedom. The `cfa_cov_1_full` model yields an improved fit compared to the baseline, though further improvement remains possible.
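
Because the baseline model is nested in the model with one residual covariance, the improvement in fit can be assessed with a chi-square difference (likelihood ratio) test. A minimal sketch of that arithmetic follows. `chisq_diff_test()` is a hypothetical helper, and the chi-square values passed to it are placeholders, not results from this analysis. With the fitted objects at hand, the `anova()` method or `lavTestLRT()` in lavaan performs the comparison directly.

```{r}
#| echo: true

# Hypothetical helper: chi-square difference test for two nested models.
chisq_diff_test <- function(chisq_restricted, df_restricted,
                            chisq_general, df_general) {
  delta_chisq <- chisq_restricted - chisq_general
  delta_df    <- df_restricted - df_general
  p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)
  c(delta_chisq = delta_chisq, delta_df = delta_df, p_value = p_value)
}

# Placeholder chi-square values only (NOT estimates from the models above);
# the degrees of freedom match the baseline (5) and one-covariance (4) models.
lrt_example <- chisq_diff_test(chisq_restricted = 50, df_restricted = 5,
                               chisq_general = 20, df_general = 4)
lrt_example
```

This plain difference applies to standard maximum likelihood estimation; robust estimators require a scaled version of the test.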

## Model with Two Residual Covariances (First Attempt)

The modification indices (MI) from the model with one residual covariance show a high value for the residual covariance [manichean ~~ peoplecentrism]{.variable}, suggesting that freeing this parameter would further improve model fit. In the subsequent model with two residual covariances, the residuals of [generalwill ~~ indivisible]{.variable} are allowed to covary as before, and an additional residual covariance is introduced for [manichean ~~ peoplecentrism]{.variable}.

There are theoretical grounds to expect that these two items share residual variance due to overlapping conceptual content. However, as shown below, the estimated residual correlation between these items is negative, which is a potential problem.

<br>

```{r}
#| echo: true

cfa_cov_2_full_manichean <- '

populism_cfa_cov_2 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
manichean ~~ peoplecentrism

'


# Fit the model to the data

fit_cov_2_full_manichean <- cfa(model = cfa_cov_2_full_manichean, data = df_integrated)


# Summarize the results

summary(fit_cov_2_full_manichean, fit.measures = TRUE, standardized = TRUE)

```

### Modification Indices for Model with Two Residual Covariances (Manichean)

```{r}
#| echo: true

modificationIndices(fit_cov_2_full_manichean, sort = TRUE)

```

There are issues with this model. Specifically, a negative residual correlation emerges between [manichean ~~ peoplecentrism]{.variable}. Running the modification indices confirms a problematic correlation between these two items.

## Model with Two Residual Covariances (Second Attempt)

In the previous specification, the residuals of [manichean ~~ peoplecentrism]{.variable} were allowed to covary, which resulted in a negative residual covariance and subsequent estimation issues. A revised model is therefore estimated. The modification indices (MI) from the model with one residual covariance ([see the modification indices above](#modification-indices-for-model-with-one-residual-convariance)) also show a high value for the residual covariance [peoplecentrism ~~ antielitism]{.variable}, suggesting that freeing this parameter would substantially improve model fit. We therefore refit the CFA model with a residual covariance for [peoplecentrism ~~ antielitism]{.variable}. The complete model now includes a residual covariance for [generalwill ~~ indivisible]{.variable} (as in the previous model) and a residual covariance for [peoplecentrism ~~ antielitism]{.variable}.

There are also theoretical grounds for expecting these two items to share residual variance due to conceptual overlap.

<br>

```{r}
#| echo: true

cfa_cov_2_full <- '

populism_cfa_cov_2 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism

'


# Fit the model to the data

fit_cov_2_full <- cfa(model = cfa_cov_2_full, data = df_integrated)


# Summarize the results

summary(fit_cov_2_full, fit.measures = TRUE, standardized = TRUE)

```

### Modification Indices for Model with Two Residual Covariances (Second Attempt)

```{r}
#| echo: true

modificationIndices(fit_cov_2_full, sort = TRUE)

```

The model converges normally with three degrees of freedom. The model fit is satisfactory, although there remains some scope for further improvement.

## Model with Three Residual Covariances

Upon examining the modification indices again, two pairs of items display high values: [manichean ~~ peoplecentrism]{.variable} and [manichean ~~ antielitism]{.variable}. Given the earlier issues associated with the residual covariance [manichean ~~ peoplecentrism]{.variable}, we specify a residual covariance for [manichean ~~ antielitism]{.variable} instead.

This model therefore retains the residual covariances for [generalwill ~~ indivisible]{.variable} and [peoplecentrism ~~ antielitism]{.variable} from the previous specification and adds a residual covariance for [manichean ~~ antielitism]{.variable}.

There are also theoretical grounds for expecting these two items to share residual variance due to their related conceptual content.

<br>

```{r}
#| echo: true

cfa_cov_3_full <- '

populism_cfa_cov_3 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism
manichean ~~  antielitism

'

# Fit the model to the data

fit_cov_3_full <- cfa(model = cfa_cov_3_full, data = df_integrated)


# Summarize the results

summary(fit_cov_3_full, fit.measures = TRUE, standardized = TRUE)

modificationIndices(fit_cov_3_full, sort = TRUE)

```

The model converges normally with two degrees of freedom and demonstrates an excellent fit. The modification indices show no remaining high values, suggesting that no further model adjustments are necessary. We therefore conclude the process of fitting the CFA models. As noted above, the CFA model with three residual covariances is the one used to produce the regression scores for the populism latent variable, which the dataset provides in both standardized and rescaled (0 to 10) form.
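
These materials do not show how the scores are extracted from the fitted model; with lavaan, `lavPredict()` with `method = "regression"` is the usual route. To illustrate what regression scores are, the sketch below computes them by hand for a one-factor model on simulated data. All parameter values and object names are made up for illustration and are not estimates from the models above.

```{r}
#| echo: true

set.seed(123)

# Hypothetical one-factor population parameters (not the estimated values)
lambda <- c(1, 0.9, 0.8, 0.85, 0.7)         # factor loadings
theta  <- diag(c(0.4, 0.5, 0.6, 0.5, 0.7))  # residual covariance matrix (diagonal here)
psi    <- 1                                 # factor variance

# Simulate the latent variable and five observed items
n_obs  <- 500
eta    <- rnorm(n_obs, sd = sqrt(psi))
errors <- matrix(rnorm(n_obs * 5), n_obs, 5) %*% sqrt(theta)
items  <- outer(eta, lambda) + errors

# Regression-method scores: centered items times psi * Sigma^{-1} * lambda
sigma      <- psi * tcrossprod(lambda) + theta  # model-implied covariance matrix
weights    <- psi * solve(sigma, lambda)
sim_scores <- as.vector(scale(items, scale = FALSE) %*% weights)

# Correlation between the scores and the simulated latent variable
cor(sim_scores, eta)
```

The same formula applies when `theta` contains off-diagonal residual covariances, as in the three-covariance model; only the model-implied covariance matrix changes.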
